KMID : 0381120150370060489
Genes and Genomics 2015, Volume 37, No. 6, pp. 489-501
Next-generation sequencing data analysis on cloud computing
Kwon Tae-Soo, Yoo Won-Gi, Lee Won-Ja, Kim Won, Kim Dae-Won
Abstract
With the advent of next-generation sequencing (NGS), including whole genome sequencing (WGS), RNA sequencing (RNA-seq), and chromatin immunoprecipitation followed by sequencing (ChIP-seq), many biologists and computer scientists are highlighting the urgent need for computing power, storage, and a range of bioinformatics software to analyze large quantities of sequence data. Building and maintaining the computational infrastructure required for massive data processing is currently among the most important tasks. However, technology platforms for handling large amounts of information pose multiple challenges for data access and processing. To overcome these challenges, cloud computing technologies are emerging as a possible infrastructure for meeting the intensive demands on computing power and communication resources in NGS data analysis. In this review, we explain the concepts and key technologies of cloud computing, such as MapReduce, and discuss the problem of data transfer. To assess the performance and usefulness of these technologies, we analyzed NGS data on cloud platforms and compared the results with those from a local cluster. From the benchmark results, we concluded that cloud computing is still more expensive than a local cluster, but that it provides reasonable performance for NGS data analysis at acceptable prices and could be a good alternative to local cluster systems.
Keywords
Next-generation sequencing, Cloud computing, Virtualization, MapReduce, High-performance computing